Overview

Dataset statistics

Number of variables 12
Number of observations 20747
Missing cells 154467
Missing cells (%) 62.0%
Duplicate rows 0
Duplicate rows (%) 0.0%
Total size in memory 1.9 MiB
Average record size in memory 96.0 B
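
The summary figures above can be cross-checked with simple arithmetic. The sketch below uses only numbers printed in this report; the variable names are illustrative.

```python
# Figures copied from this report.
n_variables = 12
n_observations = 20747
missing_cells = 154467

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Per-column missing counts from the Alerts section:
# ten columns miss 13372 values each and lucene misses all 20747.
missing_from_alerts = 10 * 13372 + 20747

# Average record size (96.0 B) times the row count, converted to MiB.
total_mib = n_observations * 96.0 / 2**20

print(total_cells, round(missing_pct, 1), missing_from_alerts, round(total_mib, 1))
# prints: 248964 62.0 154467 1.9
```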

Variable types

DateTime 1
Categorical 2
Numeric 8
Unsupported 1

Dataset

Description Returns the geocoordinates of where the phone is located. To allow comparison across sensor observations, the sampling frequency was reduced to one minute. For each categorical variable, the first non-missing value is reported.
Creator Matteo Busso, Massimo Stefan
Author Fausto Giunchiglia, Ivano Bison, Matteo Busso, Ronald Chenu-Abente, Marcelo Rodas Britez, Can Gunel, Giuseppe Veltri, Amalia de Götzen, Peter Kun, Amarsanaa Ganbold, Altangerel Chagnaa, George Gaskell, Miriam Bidoglia, Luca Cernuzzi, Alethia Hume, Jose Luis Zarza, Daniele Miorandi, Carlo Caprini
URL
Copyright (c) KnowDive 2022

Variable descriptions

experimentId Experiment Id
userId User id
timestamp Timestamp in the format month(2), day(2), hour(2), minute(2), second(2), decimals(3)
day Day in the format month(2), day(2)
accuracy The GPS accuracy in meters
lucene NO DESCRIPTION
provider Indicates whether the coordinates were found using the network/Wi-Fi or using GPS
speed The speed of the device, measured in meters/second over ground
bearing The compass direction from the current position to the intended destination. Bearing is measured in degrees and calculated clockwise from true north (e.g., the bearing for the direction of east is 090°)
latitude Geographic coordinate that specifies the N/S position. Latitude is an angle which ranges from 0° at the Equator to 90° at the poles. It is expressed in decimal degrees.
longitude Geographic coordinate that specifies the E/W position. Longitude is an angle which ranges from 0° at the prime meridian to 180°. It is expressed in decimal degrees.
altitude Elevation above sea level in meters.
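
The timestamp and day encodings described above can be decoded with the standard library. The sketch below assumes the raw timestamp is a 13-digit string in the order month, day, hour, minute, second, milliseconds; the sample value and both function names are hypothetical. Because the format carries no year, Python's strptime falls back to 1900, which would be consistent with the 1900 dates this report shows for timestamp.

```python
from datetime import datetime


def parse_timestamp(raw: str) -> datetime:
    """Parse 'MMDDhhmmssfff' (assumed layout; the year defaults to 1900)."""
    return datetime.strptime(raw, "%m%d%H%M%S%f")


def split_day(day: int) -> tuple[int, int]:
    """Split an MMDD integer such as 1122 into (month, day)."""
    return divmod(day, 100)


ts = parse_timestamp("1122032000000")  # hypothetical raw value
print(ts)
print(split_day(1206))
```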

Alerts

experimentId has constant value "wenet" Constant
userId is highly correlated with latitude and 1 other fields High correlation
day is highly correlated with latitude and 1 other fields High correlation
speed is highly correlated with bearing and 1 other fields High correlation
bearing is highly correlated with speed and 1 other fields High correlation
latitude is highly correlated with userId and 2 other fields High correlation
longitude is highly correlated with userId and 2 other fields High correlation
altitude is highly correlated with speed and 1 other fields High correlation
userId is highly correlated with latitude and 1 other fields High correlation
day is highly correlated with altitude High correlation
latitude is highly correlated with userId and 1 other fields High correlation
longitude is highly correlated with userId and 1 other fields High correlation
altitude is highly correlated with day High correlation
speed is highly correlated with bearing and 1 other fields High correlation
bearing is highly correlated with speed and 1 other fields High correlation
latitude is highly correlated with longitude High correlation
longitude is highly correlated with latitude High correlation
altitude is highly correlated with speed and 1 other fields High correlation
experimentId is highly correlated with provider High correlation
provider is highly correlated with experimentId High correlation
userId is highly correlated with latitude and 1 other fields High correlation
day is highly correlated with longitude and 1 other fields High correlation
provider is highly correlated with latitude High correlation
speed is highly correlated with bearing High correlation
bearing is highly correlated with speed and 1 other fields High correlation
latitude is highly correlated with userId and 3 other fields High correlation
longitude is highly correlated with userId and 2 other fields High correlation
altitude is highly correlated with day and 2 other fields High correlation
experimentId has 13372 (64.5%) missing values Missing
userId has 13372 (64.5%) missing values Missing
day has 13372 (64.5%) missing values Missing
accuracy has 13372 (64.5%) missing values Missing
lucene has 20747 (100.0%) missing values Missing
provider has 13372 (64.5%) missing values Missing
speed has 13372 (64.5%) missing values Missing
bearing has 13372 (64.5%) missing values Missing
latitude has 13372 (64.5%) missing values Missing
longitude has 13372 (64.5%) missing values Missing
altitude has 13372 (64.5%) missing values Missing
timestamp has unique values Unique
lucene is an unsupported type, check if it needs cleaning or further analysis Unsupported
userId has 1026 (4.9%) zeros Zeros
speed has 951 (4.6%) zeros Zeros
bearing has 924 (4.5%) zeros Zeros

Reproduction

Analysis started 2022-07-04 18:03:15.436189
Analysis finished 2022-07-04 18:03:33.676164
Duration 18.24 seconds
Software version pandas-profiling v3.2.0
Download configuration config.json

Variables

timestamp
Date

UNIQUE

Timestamp in the format month(2), day(2), hour(2), minute(2), second(2), decimals(3)

Distinct 20747
Distinct (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory size 162.2 KiB
Minimum 1900-11-22 03:20:00
Maximum 1900-12-06 13:06:00
Histogram with fixed size bins (bins=50)

experimentId
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Experiment Id

Distinct 1
Distinct (%) < 0.1%
Missing 13372
Missing (%) 64.5%
Memory size 162.2 KiB
wenet 7375

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 36875
Distinct characters 4
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row wenet
2nd row wenet
3rd row wenet
4th row wenet
5th row wenet

Common Values

Value Count Frequency (%)
wenet 7375 35.5%
(Missing) 13372 64.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
wenet 7375 100.0%

Most occurring characters

Value Count Frequency (%)
e 14750 40.0%
w 7375 20.0%
n 7375 20.0%
t 7375 20.0%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 36875 100.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 14750 40.0%
w 7375 20.0%
n 7375 20.0%
t 7375 20.0%

Most occurring scripts

Value Count Frequency (%)
Latin 36875 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
e 14750 40.0%
w 7375 20.0%
n 7375 20.0%
t 7375 20.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 36875 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 14750 40.0%
w 7375 20.0%
n 7375 20.0%
t 7375 20.0%

userId
Real number (ℝ ≥0 )

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

User id

Distinct 6
Distinct (%) 0.1%
Missing 13372
Missing (%) 64.5%
Infinite 0
Infinite (%) 0.0%
Mean 6.688542373
Minimum 0
Maximum 14
Zeros 1026
Zeros (%) 4.9%
Negative 0
Negative (%) 0.0%
Memory size 162.2 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 4
median 4
Q3 13
95-th percentile 13
Maximum 14
Range 14
Interquartile range (IQR) 9

Descriptive statistics

Standard deviation 5.054538222
Coefficient of variation (CV) 0.7557010093
Kurtosis -1.548913343
Mean 6.688542373
Median Absolute Deviation (MAD) 4
Skewness 0.2985465925
Sum 49328
Variance 25.54835664
Monotonicity Not monotonic
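
As a check on how these descriptive statistics relate to one another, the sketch below recomputes several of them from the value counts this report lists for userId. It assumes the variance uses the sample divisor (n - 1), which is the pandas default; the variable names are illustrative.

```python
import math

# Value counts for userId as listed in this report (non-missing rows only).
counts = {0: 1026, 1: 249, 4: 3364, 11: 64, 13: 2489, 14: 183}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance with divisor n - 1 (assumed, matching the pandas default).
variance = sum(count * (value - mean) ** 2 for value, count in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total, round(mean, 4), round(variance, 4), round(std, 4), round(cv, 4))
```

The recomputed sum, mean, variance, standard deviation and coefficient of variation agree with the figures reported above to the printed precision.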
Histogram with fixed size bins (bins=6)

Value Count Frequency (%)
4 3364 16.2%
13 2489 12.0%
0 1026 4.9%
1 249 1.2%
14 183 0.9%
11 64 0.3%
(Missing) 13372 64.5%

Value Count Frequency (%)
0 1026 4.9%
1 249 1.2%
4 3364 16.2%
11 64 0.3%
13 2489 12.0%
14 183 0.9%

Value Count Frequency (%)
14 183 0.9%
13 2489 12.0%
11 64 0.3%
4 3364 16.2%
1 249 1.2%
0 1026 4.9%

day
Real number (ℝ ≥0 )

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Day in the format month(2), day(2)

Distinct 12
Distinct (%) 0.2%
Missing 13372
Missing (%) 64.5%
Infinite 0
Infinite (%) 0.0%
Mean 1146.970305
Minimum 1122
Maximum 1206
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.2 KiB

Quantile statistics

Minimum 1122
5-th percentile 1122
Q1 1123
median 1127
Q3 1201
95-th percentile 1202
Maximum 1206
Range 84
Interquartile range (IQR) 78

Descriptive statistics

Standard deviation 35.0923121
Coefficient of variation (CV) 0.03059565879
Kurtosis -1.139959954
Mean 1146.970305
Median Absolute Deviation (MAD) 4
Skewness 0.9186766819
Sum 8458906
Variance 1231.470368
Monotonicity Increasing
Histogram with fixed size bins (bins=12)

Value Count Frequency (%)
1201 1303 6.3%
1123 1277 6.2%
1124 1139 5.5%
1122 979 4.7%
1127 900 4.3%
1202 535 2.6%
1128 460 2.2%
1126 270 1.3%
1130 213 1.0%
1205 182 0.9%
Other values (2) 117 0.6%
(Missing) 13372 64.5%

Value Count Frequency (%)
1122 979 4.7%
1123 1277 6.2%
1124 1139 5.5%
1126 270 1.3%
1127 900 4.3%
1128 460 2.2%
1130 213 1.0%
1201 1303 6.3%
1202 535 2.6%
1204 57 0.3%

Value Count Frequency (%)
1206 60 0.3%
1205 182 0.9%
1204 57 0.3%
1202 535 2.6%
1201 1303 6.3%
1130 213 1.0%
1128 460 2.2%
1127 900 4.3%
1126 270 1.3%
1124 1139 5.5%

accuracy
Real number (ℝ ≥0 )

MISSING

The GPS accuracy in meters

Distinct 4145
Distinct (%) 56.2%
Missing 13372
Missing (%) 64.5%
Infinite 0
Infinite (%) 0.0%
Mean 23.21014013
Minimum 1
Maximum 1299.999
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 162.2 KiB

Quantile statistics

Minimum 1
5-th percentile 11.38
Q1 12.7585
median 13.973
Q3 20
95-th percentile 37.7426756
Maximum 1299.999
Range 1298.999
Interquartile range (IQR) 7.2415

Descriptive statistics

Standard deviation 74.76384356
Coefficient of variation (CV) 3.221171571
Kurtosis 181.9169187
Mean 23.21014013
Median Absolute Deviation (MAD) 1.846
Skewness 13.2748485
Sum 171174.7835
Variance 5589.632304
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Value Count Frequency (%)
20 831 4.0%
26.4 41 0.2%
23.6 40 0.2%
4 40 0.2%
8 35 0.2%
28.1 34 0.2%
24.9 32 0.2%
6 27 0.1%
12 26 0.1%
20.1 23 0.1%
Other values (4135) 6246 30.1%
(Missing) 13372 64.5%

Value Count Frequency (%)
1 1 < 0.1%
1.5 1 < 0.1%
2 3 < 0.1%
2.3 1 < 0.1%
2.4 1 < 0.1%
2.6 1 < 0.1%
2.8 1 < 0.1%
3 5 < 0.1%
3.6 1 < 0.1%
3.7 1 < 0.1%

Value Count Frequency (%)
1299.999 1 < 0.1%
1200 2 < 0.1%
1100 17 0.1%
1049 1 < 0.1%
1000 9 < 0.1%
899.999 5 < 0.1%
800 2 < 0.1%
500 4 < 0.1%
400 1 < 0.1%
300 1 < 0.1%

lucene
Unsupported

MISSING
REJECTED
UNSUPPORTED

NO DESCRIPTION

Missing 20747
Missing (%) 100.0%
Memory size 162.2 KiB

provider
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Indicates whether the coordinates were found using the network/Wi-Fi or using GPS

Distinct 3
Distinct (%) < 0.1%
Missing 13372
Missing (%) 64.5%
Memory size 162.2 KiB
passive 5948
network 1371
gps 56

Length

Max length 7
Median length 7
Mean length 6.969627119
Min length 3

Characters and Unicode

Total characters 51401
Distinct characters 13
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row passive
2nd row passive
3rd row passive
4th row passive
5th row passive

Common Values

Value Count Frequency (%)
passive 5948 28.7%
network 1371 6.6%
gps 56 0.3%
(Missing) 13372 64.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value Count Frequency (%)
passive 5948 80.7%
network 1371 18.6%
gps 56 0.8%

Most occurring characters

Value Count Frequency (%)
s 11952 23.3%
e 7319 14.2%
p 6004 11.7%
a 5948 11.6%
i 5948 11.6%
v 5948 11.6%
n 1371 2.7%
t 1371 2.7%
w 1371 2.7%
o 1371 2.7%
Other values (3) 2798 5.4%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 51401 100.0%

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
s 11952 23.3%
e 7319 14.2%
p 6004 11.7%
a 5948 11.6%
i 5948 11.6%
v 5948 11.6%
n 1371 2.7%
t 1371 2.7%
w 1371 2.7%
o 1371 2.7%
Other values (3) 2798 5.4%

Most occurring scripts

Value Count Frequency (%)
Latin 51401 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
s 11952 23.3%
e 7319 14.2%
p 6004 11.7%
a 5948 11.6%
i 5948 11.6%
v 5948 11.6%
n 1371 2.7%
t 1371 2.7%
w 1371 2.7%
o 1371 2.7%
Other values (3) 2798 5.4%

Most occurring blocks

Value Count Frequency (%)
ASCII 51401 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
s 11952 23.3%
e 7319 14.2%
p 6004 11.7%
a 5948 11.6%
i 5948 11.6%
v 5948 11.6%
n 1371 2.7%
t 1371 2.7%
w 1371 2.7%
o 1371 2.7%
Other values (3) 2798 5.4%

speed
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

The speed of the device, measured in meters/second over ground

Distinct 165
Distinct (%) 2.2%
Missing 13372
Missing (%) 64.5%
Infinite 0
Infinite (%) 0.0%
Mean -0.0988759322
Minimum -1
Maximum 23.27
Zeros 951
Zeros (%) 4.6%
Negative 6203
Negative (%) 29.9%
Memory size 162.2 KiB

Quantile statistics

Minimum -1
5-th percentile -1
Q1 -0.01
median -0.01
Q3 -0.01
95-th percentile 0
Maximum 23.27
Range 24.27
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 1.040423477
Coefficient of variation (CV) -10.52251497
Kurtosis 187.1391202
Mean -0.0988759322
Median Absolute Deviation (MAD) 0
Skewness 11.65956569
Sum -729.21
Variance 1.082481012
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Value Count Frequency (%)
-0.01 4832 23.3%
-1 1371 6.6%
0 951 4.6%
1.04 4 < 0.1%
1.49 4 < 0.1%
1.15 3 < 0.1%
0.77 3 < 0.1%
1.66 3 < 0.1%
1.73 3 < 0.1%
0.46 3 < 0.1%
Other values (155) 198 1.0%
(Missing) 13372 64.5%

Value Count Frequency (%)
-1 1371 6.6%
-0.01 4832 23.3%
0 951 4.6%
0.01 1 < 0.1%
0.02 1 < 0.1%
0.03 2 < 0.1%
0.06 1 < 0.1%
0.07 2 < 0.1%
0.12 1 < 0.1%
0.15 3 < 0.1%

Value Count Frequency (%)
23.27 1 < 0.1%
21.62 1 < 0.1%
21.54 1 < 0.1%
19.35 1 < 0.1%
16.9 1 < 0.1%
16.77 1 < 0.1%
16.15 1 < 0.1%
15.69 1 < 0.1%
15.45 1 < 0.1%
15.19 1 < 0.1%

bearing
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

The compass direction from the current position to the intended destination. Bearing is measured in degrees and calculated clockwise from true north (e.g., the bearing for the direction of east is 090°)

Distinct 240
Distinct (%) 3.3%
Missing 13372
Missing (%) 64.5%
Infinite 0
Infinite (%) 0.0%
Mean 4.349530847
Minimum -1
Maximum 354.34
Zeros 924
Zeros (%) 4.5%
Negative 6203
Negative (%) 29.9%
Memory size 162.2 KiB

Quantile statistics

Minimum -1
5-th percentile -1
Q1 -1
median -1
Q3 -1
95-th percentile 0
Maximum 354.34
Range 355.34
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 33.15126212
Coefficient of variation (CV) 7.621801818
Kurtosis 57.33679434
Mean 4.349530847
Median Absolute Deviation (MAD) 0
Skewness 7.341107774
Sum 32077.79
Variance 1099.00618
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Value Count Frequency (%)
-1 6203 29.9%
0 924 4.5%
79.62 4 < 0.1%
87.1 3 < 0.1%
233.4 2 < 0.1%
74.93 2 < 0.1%
272.9 2 < 0.1%
105.36 2 < 0.1%
301.4 2 < 0.1%
278.16 1 < 0.1%
Other values (230) 230 1.1%
(Missing) 13372 64.5%

Value Count Frequency (%)
-1 6203 29.9%
0 924 4.5%
2.76 1 < 0.1%
4.6 1 < 0.1%
6.97 1 < 0.1%
7.17 1 < 0.1%
7.56 1 < 0.1%
10.02 1 < 0.1%
10.2 1 < 0.1%
10.86 1 < 0.1%

Value Count Frequency (%)
354.34 1 < 0.1%
349.4 1 < 0.1%
349.35 1 < 0.1%
346.7 1 < 0.1%
341.42 1 < 0.1%
340.66 1 < 0.1%
339.1 1 < 0.1%
335 1 < 0.1%
333.89 1 < 0.1%
333.2 1 < 0.1%

latitude
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Geographic coordinate that specifies the N/S position. Latitude is an angle which ranges from 0° at the Equator to 90° at the poles. It is expressed in decimal degrees.

Distinct 252
Distinct (%) 3.4%
Missing 13372
Missing (%) 64.5%
Infinite 0
Infinite (%) 0.0%
Mean 3.820344963
Minimum -25.3079
Maximum 51.532
Zeros 0
Zeros (%) 0.0%
Negative 4390
Negative (%) 21.2%
Memory size 162.2 KiB

Quantile statistics

Minimum -25.3079
5-th percentile -25.2937
Q1 -25.2937
median -25.2773
Q3 47.3861
95-th percentile 47.39449
Maximum 51.532
Range 76.8399
Interquartile range (IQR) 72.6798

Descriptive statistics

Standard deviation 35.35295973
Coefficient of variation (CV) 9.253865836
Kurtosis -1.834726377
Mean 3.820344963
Median Absolute Deviation (MAD) 0.0164
Skewness 0.3959807018
Sum 28175.0441
Variance 1249.831762
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Value Count Frequency (%)
-25.2937 2669 12.9%
47.3861 1423 6.9%
-25.2773 864 4.2%
-25.2938 257 1.2%
47.3878 251 1.2%
37.177 241 1.2%
-25.2936 232 1.1%
47.3862 203 1.0%
47.386 133 0.6%
47.4913 132 0.6%
Other values (242) 970 4.7%
(Missing) 13372 64.5%

Value Count Frequency (%)
-25.3079 1 < 0.1%
-25.3025 1 < 0.1%
-25.299 1 < 0.1%
-25.2966 1 < 0.1%
-25.2959 1 < 0.1%
-25.2955 1 < 0.1%
-25.2952 1 < 0.1%
-25.2951 2 < 0.1%
-25.2947 2 < 0.1%
-25.2945 1 < 0.1%

Value Count Frequency (%)
51.532 10 < 0.1%
51.5319 1 < 0.1%
51.5318 2 < 0.1%
51.5317 1 < 0.1%
51.5311 1 < 0.1%
51.531 1 < 0.1%
51.5309 1 < 0.1%
51.5308 2 < 0.1%
51.5307 1 < 0.1%
51.5306 1 < 0.1%

longitude
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Geographic coordinate that specifies the E/W position. Longitude is an angle which ranges from 0° at the prime meridian to 180°. It is expressed in decimal degrees.

Distinct 303
Distinct (%) 4.1%
Missing 13372
Missing (%) 64.5%
Infinite 0
Infinite (%) 0.0%
Mean -30.81108098
Minimum -57.6376
Maximum 19.0704
Zeros 0
Zeros (%) 0.0%
Negative 4703
Negative (%) 22.7%
Memory size 162.2 KiB

Quantile statistics

Minimum -57.6376
5-th percentile -57.6374
Q1 -57.6365
median -57.5569
Q3 9.2854
95-th percentile 9.2857
Maximum 19.0704
Range 76.708
Interquartile range (IQR) 66.9219

Descriptive statistics

Standard deviation 32.63969345
Coefficient of variation (CV) -1.059349183
Kurtosis -1.802222895
Mean -30.81108098
Median Absolute Deviation (MAD) 0.0796
Skewness 0.411562126
Sum -227231.7222
Variance 1065.349588
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Value Count Frequency (%)
-57.6365 2138 10.3%
9.2854 1338 6.4%
-57.5569 924 4.5%
-57.6364 530 2.6%
-57.6374 425 2.0%
-3.6055 234 1.1%
9.2699 131 0.6%
19.0682 130 0.6%
9.2853 125 0.6%
9.27 120 0.6%
Other values (293) 1280 6.2%
(Missing) 13372 64.5%

Value Count Frequency (%)
-57.6376 4 < 0.1%
-57.6375 51 0.2%
-57.6374 425 2.0%
-57.6373 8 < 0.1%
-57.6372 2 < 0.1%
-57.6371 2 < 0.1%
-57.637 2 < 0.1%
-57.6369 1 < 0.1%
-57.6368 2 < 0.1%
-57.6367 6 < 0.1%

Value Count Frequency (%)
19.0704 1 < 0.1%
19.0682 130 0.6%
19.0681 5 < 0.1%
19.068 19 0.1%
19.0676 1 < 0.1%
19.0674 2 < 0.1%
19.0668 1 < 0.1%
19.0664 1 < 0.1%
19.0661 1 < 0.1%
19.066 2 < 0.1%

altitude
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Elevation above sea level in meters.

Distinct 1168
Distinct (%) 15.8%
Missing 13372
Missing (%) 64.5%
Infinite 0
Infinite (%) 0.0%
Mean 103.595655
Minimum -151.241
Maximum 968.2063
Zeros 2
Zeros (%) < 0.1%
Negative 6205
Negative (%) 29.9%
Memory size 162.2 KiB

Quantile statistics

Minimum -151.241
5-th percentile -1
Q1 -1
median -1
Q3 -1
95-th percentile 833.32634
Maximum 968.2063
Range 1119.4473
Interquartile range (IQR) 0

Descriptive statistics

Standard deviation 267.7728296
Coefficient of variation (CV) 2.58478823
Kurtosis 3.348709062
Mean 103.595655
Median Absolute Deviation (MAD) 0
Skewness 2.291766181
Sum 764017.9553
Variance 71702.28826
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)

Value Count Frequency (%)
-1 6203 29.9%
0 2 < 0.1%
111.7 2 < 0.1%
34 2 < 0.1%
33.9 2 < 0.1%
151.5919 2 < 0.1%
855.8237 1 < 0.1%
838.4717 1 < 0.1%
803.6946 1 < 0.1%
846.7034 1 < 0.1%
Other values (1158) 1158 5.6%
(Missing) 13372 64.5%

Value Count Frequency (%)
-151.241 1 < 0.1%
-99.6189 1 < 0.1%
-1 6203 29.9%
0 2 < 0.1%
19.2 1 < 0.1%
28.3755 1 < 0.1%
33.9 2 < 0.1%
34 2 < 0.1%
47.9149 1 < 0.1%
53.6 1 < 0.1%

Value Count Frequency (%)
968.2063 1 < 0.1%
939.2587 1 < 0.1%
937.146 1 < 0.1%
927.8655 1 < 0.1%
923.4714 1 < 0.1%
921.501 1 < 0.1%
920.1565 1 < 0.1%
917.1409 1 < 0.1%
916.9326 1 < 0.1%
916.0374 1 < 0.1%

Interactions

[64 pairwise interaction plots omitted.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
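
The calculation just described can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. This is an illustrative sketch, not the code the profiling tool runs; the function names and sample data are assumptions, and ties receive their average rank.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # monotonic but nonlinear in x
print(spearman(x, y), pearson(x, y))
```

On this made-up data the rank correlation is 1 up to floating-point rounding, while Pearson's r on the raw values is below 1 because the relationship is not linear.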

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
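
A minimal sketch of this formula, together with a check of the invariance under changes in location and scale, might look as follows. The function name and the sample values are illustrative assumptions.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]  # illustrative values only
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_rescaled)
```

Both calls return the same value up to floating-point rounding, since shifting or positively rescaling either variable changes the covariance and the standard deviations by the same factor.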

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
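
The pair-counting definition above can be written out directly. The sketch below implements exactly that definition (often called tau-a), in which tied pairs count as neither concordant nor discordant; statistical libraries commonly default to a tie-adjusted variant, so results on data with ties may differ. The function name and sample data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 means a tie in x or y; it counts toward neither total.
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# One swapped pair out of six: 5 concordant, 1 discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```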

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
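
For illustration, a bias-corrected Cramér's V in the style of Bergsma's correction can be sketched with the standard library. This is an assumed reading of the correction, not the profiling tool's own code; the function name is hypothetical, and returning 0.0 when either variable has fewer than two categories is a choice made for this sketch.

```python
import math
from collections import Counter


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    n = len(x)
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r, k = len(row_totals), len(col_totals)
    if n < 2 or r < 2 or k < 2:
        return 0.0  # sketch-specific choice for degenerate tables

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected bias and shrink the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


perfect = ["a"] * 50 + ["b"] * 50
print(cramers_v_corrected(perfect, perfect))
print(cramers_v_corrected(["a", "a", "b", "b"] * 25, ["c", "d", "c", "d"] * 25))
```

The first call uses two identical label sequences (perfect association) and the second uses a balanced table in which every cell matches its expected count (independence).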

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.